Rapidly Registering Identity-by-Descent Across Ancestral Recombination Graphs
Authors
Abstract
The genomes of remotely related individuals occasionally contain long segments that are identical by descent (IBD). Sharing of IBD segments has many applications in population and medical genetics, and it is thus desirable to study their properties in simulations. However, no current method provides a direct, efficient means to extract IBD segments from simulated genealogies. Here, we introduce computationally efficient approaches to extract ground-truth IBD segments from a sequence of genealogies, or equivalently, an ancestral recombination graph. Specifically, we use a two-step scheme, where we first identify putative shared segments by comparing the common ancestors of all pairs of individuals at some distance apart. This reduces the search space considerably, and we then proceed by determining the true IBD status of the candidate segments. Under some assumptions and when allowing a limited resolution of segment lengths, our run-time complexity is reduced from O(n^3 log n) for the naïve algorithm to O(n log n), where n is the number of individuals in the sample.
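The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and makes no attempt to reach the stated O(n log n) run time. It assumes, hypothetically, that the genealogies are given as a sorted list of `(left, right, parent)` tuples covering half-open intervals, that node identifiers are consistent across trees, and that a pair is treated as IBD over a window when its most recent common ancestor is the same node in every tree overlapping that window. The names `mrca`, `tree_at`, `ibd_windows`, and `step` are illustrative only.

```python
from itertools import combinations


def mrca(parent, a, b):
    """Most recent common ancestor of a and b in one tree (child -> parent map)."""
    ancestors = set()
    while a is not None:
        ancestors.add(a)
        a = parent.get(a)
    while b is not None and b not in ancestors:
        b = parent.get(b)
    return b  # None if the two nodes share no ancestor in this tree


def tree_at(trees, pos):
    """Parent map of the tree whose half-open interval [left, right) covers pos."""
    for left, right, parent in trees:
        if left <= pos < right:
            return parent
    raise ValueError(f"no tree covers position {pos}")


def ibd_windows(trees, samples, step):
    """Return (a, b, start, end) for each grid window over which a and b are IBD."""
    seq_end = trees[-1][1]
    grid = []
    pos = trees[0][0]
    while pos < seq_end:
        grid.append(pos)
        pos += step
    found = []
    for a, b in combinations(samples, 2):
        for start, end in zip(grid, grid[1:]):
            anc = mrca(tree_at(trees, start), a, b)
            # Step 1: cheap filter, compare ancestors only at the window ends.
            if anc is None or anc != mrca(tree_at(trees, end), a, b):
                continue
            # Step 2: verify the candidate against every tree overlapping the window.
            if all(mrca(p, a, b) == anc
                   for left, right, p in trees
                   if left <= end and right > start):
                found.append((a, b, start, end))
    return found


# Toy example (hypothetical data): three trees over positions 0..30.
trees = [
    (0, 10, {0: 3, 1: 3, 2: 4, 3: 4}),
    (10, 20, {0: 3, 1: 3, 2: 5, 3: 5}),
    (20, 30, {0: 3, 1: 3, 2: 4, 3: 4}),
]
result = ibd_windows(trees, [0, 1, 2], step=20)
print(result)  # [(0, 1, 0, 20)]
```

In the toy data, the pairs (0, 2) and (1, 2) pass the first step because their ancestor is node 4 at both window ends, but the second step rejects them because the middle tree gives node 5. This is the role of the verification step: the endpoint comparison only narrows the search space. The grid spacing `step` corresponds loosely to the limited resolution of segment lengths that the abstract mentions.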
Related resources
Identity by descent: variation in meiosis, across genomes, and in populations.
Gene identity by descent (IBD) is a fundamental concept that underlies genetically mediated similarities among relatives. Gene IBD is traced through ancestral meioses and is defined relative to founders of a pedigree, or to some time point or mutational origin in the coalescent of a set of extant genes in a population. The random process underlying changes in the patterns of IBD across the geno...
Inference of Ancestral Recombination Graphs through Topological Data Analysis
The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are inte...
GraphML specializations to codify ancestral recombinant graphs
Software which simulates, infers, or analyzes ancestral recombination graphs (ARGs) faces the problem of communicating them. Existing formats omit information either about the location of recombinations along the chromosome or the position of recombinations relative to the branching topology. We present a specialization of GraphML, an XML-based standard for mathematical graphs, for communicatio...
Recombination, gene conversion, and identity-by-descent at three loci.
We investigate the probabilities of identity-by-descent at three loci in order to find a signature which differentiates between the two types of crossing over events: recombination and gene conversion. We use a Markov chain to model coalescence, recombination, gene conversion and mutation in a sample of size two. Using numerical analysis, we calculate the total probability of identity-by-descen...
Ancestral Processes with Selection
In this paper, we show how to construct the genealogy of a sample of genes for a large class of models with selection and mutation. Each gene corresponds to a single locus at which there is no recombination. The genealogy of the sample is embedded in a graph which we call the ancestral selection graph. This graph contains all the information about the ancestry; it is the analogue of Kingman's c...
Journal title:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 23, Issue 6
Pages: -
Publication date: 2015